Voting Between Multiple Data Representations for Text Chunking
Abstract
This paper considers the hypothesis that voting between multiple data representations can be more accurate than voting between multiple learning models. This hypothesis has been considered before (cf. [San00]), but the focus there was on voting methods rather than on the data representations. In this paper, we focus on choosing specific data representations and combining them with simple majority voting. On the community-standard CoNLL-2000 data set, using no knowledge sources other than the training data, we achieve an Fβ=1 score of 94.01 for arbitrary phrase identification, compared to the previous best Fβ=1 of 93.90. We also obtain an Fβ=1 score of 95.23 for Base NP identification. Significance tests show that our Base NP identification score is significantly better than the previous comparable best Fβ=1 score of 94.22. Our main contribution is that our model is a fast, linear-time approach, whereas the previous best approach is significantly slower.
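As a rough illustration of the idea in the abstract, the sketch below votes between taggers that were trained on different chunk-tag encodings. The abstract does not say which representations are used, so the IOB1 and IOB2 encodings, the function names, the example tag sequences, and the tie-breaking rule are all assumptions made for this example, not the paper's actual setup. Each prediction is first converted to one common encoding, then a simple per-token majority vote is taken.

```python
from collections import Counter


def iob1_to_iob2(tags):
    """Convert IOB1 tags to IOB2 (assumed encodings, not named in the abstract).

    IOB1 uses B- only when a chunk directly follows another chunk of the
    same type; IOB2 marks the first token of every chunk with B-.
    """
    out = []
    prev_type = None  # chunk type of the previous token, None if outside a chunk
    for tag in tags:
        if tag == "O":
            out.append(tag)
            prev_type = None
            continue
        prefix, chunk_type = tag.split("-", 1)
        if prefix == "I" and prev_type != chunk_type:
            out.append("B-" + chunk_type)  # a new chunk starts here
        else:
            out.append(tag)
        prev_type = chunk_type
    return out


def majority_vote(predictions):
    """Per-token majority vote over tag sequences in the same encoding.

    Ties go to the earliest sequence in `predictions` (an assumption).
    """
    voted = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        voted.append(next(t for t in token_tags if counts[t] == best))
    return voted


# Hypothetical outputs of three taggers for one four-token sentence.
pred_a_iob2 = ["B-NP", "I-NP", "B-VP", "B-NP"]
pred_b_iob1 = ["I-NP", "I-NP", "I-VP", "I-NP"]  # trained on IOB1
pred_c_iob2 = ["B-NP", "B-NP", "B-VP", "B-NP"]  # disagrees on the second token

common = [pred_a_iob2, iob1_to_iob2(pred_b_iob1), pred_c_iob2]
voted = majority_vote(common)
```

Both steps make a single pass over the tokens, so the combination stays linear in sentence length for a fixed number of taggers.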
Similar resources
Artificial General Segmentation
We argue that the ability to find meaningful chunks in sequential input is a core cognitive ability for artificial general intelligence, and that the Voting Experts algorithm, which searches for an information theoretic signature of chunks, provides a general implementation of this ability. In support of this claim, we demonstrate that VE successfully finds chunks in a wide variety of domains, ...
Chunking with Support Vector Machines
We apply Support Vector Machines (SVMs) to identify English base phrases (chunks). SVMs are known to achieve high generalization performance even with input data of high dimensional feature spaces. Furthermore, by the Kernel principle, SVMs can carry out training with smaller computational overhead independent of their dimensionality. We apply weighted voting of 8 SVM-based systems trained with...
Bootstrap Voting Experts
BOOTSTRAP VOTING EXPERTS (BVE) is an extension to the VOTING EXPERTS algorithm for unsupervised chunking of sequences. BVE generates a series of segmentations, each of which incorporates knowledge gained from the previous segmentation. We show that this method of bootstrapping improves the performance of VOTING EXPERTS in a variety of unsupervised word segmentation scenarios, and generally impr...
Word Segmentation as General Chunking
During language acquisition, children learn to segment speech into phonemes, syllables, morphemes, and words. We examine word segmentation specifically, and explore the possibility that children might have general-purpose chunking mechanisms to perform word segmentation. The Voting Experts (VE) and Bootstrapped Voting Experts (BVE) algorithms serve as computational models of this chunking abilit...
Layered Mereotopology